A fuzzy semi-supervised support vector machine approach to hypertext categorization

نویسنده

  • Houda Benbrahim
چکیده

Hypertext/text domains are characterized by several tens or hundreds of thousands of features. This represents a challenge for supervised learning algorithms which have to learn accurate classifiers using a small set of available training examples. In this paper, a fuzzy semi-supervised support vector machines (FSS-SVM) algorithm is proposed. It tries to overcome the need for a large labelled training set. For this, it uses both labelled and unlabelled data for training. It also modulates the effect of the unlabelled data in the learning process. Empirical evaluations with two real-world hypertext datasets showed that, by additionally using unlabelled data, FSS-SVM requires less labelled training data than its supervised version, support vector machines, to achieve the same level of classification performance. Also, the incorporated fuzzy membership values of the unlabelled training patterns in the learning process have positively influenced the classification performance in comparison with its crisp variant.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Fuzzy Semi-Supervised Support Vector Machines Approach to Hypertext Categorization

Hypertext/text domains are characterized by several tens or hundreds of thousands of features. This represents a challenge for supervised learning algorithms which have to learn accurate classifiers using a small set of available training examples. In this paper, a fuzzy semi-supervised support vector machines (FSS-SVM) algorithm is proposed. It tries to overcome the need for a large labelled t...

متن کامل

Approach to Hypertext Categorization

Hypertext/text domains are characterized by several tens or hundreds of thousands of features. This represents a challenge for supervised learning algorithms which have to learn accurate classifiers using a small set of available training examples. In this paper, a fuzzy semi-supervised support vector machines (FSS-SVM) algorithm is proposed. It tries to overcome the need for a large labelled t...

متن کامل

Pattern classification and clustering: A review of partially supervised learning approaches

The paper categorizes and reviews the state-of-the-art approaches to the partially supervised learning (PSL) task. Special emphasis is put on the fields of pattern recognition and clustering involving partially (or, weakly) labeled data sets. The major instances of PSL techniques are categorized into the following taxonomy: (i) active learning for training set design, where the learning algorit...

متن کامل

Border sensitive fuzzy vector quantization in semi-supervised learning

Abstract. We propose a semi-supervised fuzzy vector quantization method for the classification of incompletely labeled data. Since information contained within the structure of the data set should not be neglected, our method considers the whole data set during the learning process. In difference to known methods our approach uses neighborhood cooperativeness for stable prototype learning known...

متن کامل

Improving Relevance Feedback in Image Retrieval by Incorporating Unlabelled Images

In content-base image retrieval, relevance feedback (RF) schemes based on support vector machine (SVM) have been widely used to narrow the semantic gap between low-level visual features and high-level human perception. However, the performance of image retrieval with SVM active learning is known to be poor when the training data is insufficient. In this paper, the problem is solved by incorpora...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008